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Nonlinear effects in protein dynamics are expected to play role in fnnction, particularly of allosteric 
nature, by facilitating energy transfer between vibrational modes. A recently proposed method fo¬ 
cusing on the non-Gaussian shape of the population near equilibrium projects this information onto 
real space in order to identify the aminoacids relevant to function. We here apply this method 
to three ancestral proteins in glucocorticoid receptor (GR) family and show that the mutations 
that restrict functional activity during GR evolution correlate significantly with locations that are 
highlighted by the nonlinear contribution to the near-native configurational distribution. Our find¬ 
ings demonstrate that nonlinear effects are not only indispensible for understanding functionality in 
proteins, but they can also be harnessed into a predictive tool for functional site determination. 


A. Introduction 

Mechanisms of information transfer and function in 
proteins continue to be challenging problems where dif¬ 
ferent points of view compete. The ensemble view, i.e., 
that a ligand binding event triggers allostery by mod¬ 
ifying the free energy landscape is now a commonly 
recognized paradigm EH!- The so called “population 
shift” picture is helpful in understanding allostery with¬ 
out shape change and finds support from recent NMR 
studies Q. Models which focus on mechanistic aspects, 
such as the suppression of a certain vibrational mode 
or energy transfer between two modes are also 

widely employed, thanks to the intuitive picture they of¬ 
fer. Through such models, it is, for example, possible to 
discuss positive/negative allostery 0 - 

A recently proposed approach that rests on dynamical 
simulations around equilibrium delivers a tool for predict¬ 
ing locations relevant to mode coupling 0[il. The cen¬ 
tral idea of the method is to quantify the nonlinear contri¬ 
bution (required for energy transfer between vibrational 
modes) to the near-native configurational probability dis¬ 
tribution function and identify the residues on which it 
has the highest impact. Present work is an application of 
this method to a set of ancestral glucocorticoid receptor 
(GR) proteins and observes a statistically significant cor¬ 
relation between locations underlined by mode coupling 
and function changing mutations that took place during 
the evolution of the GR proteins. Multiple molecular 
dynamics (MD) trajectories of ~ O.lpis for each protein 
further allow us to discuss the robustness of the method 
between independent MD runs, as well as the sensitivity 
of our findings to the simulation length. 


B. Extracting information on mode coupling 


Consider a protein composed of N aminoacids. Let 
the Cartesian coordinates of carbon-alpha {Ca) atoms 
be stored in the vector R of length 3N, encoding a 
coarse representation of the protein’s spatial arrange¬ 


ment, or configuration. The configurational probabil¬ 
ity distribution, p{R) can be derived numerically from 
an X 3N real-valued matrix, where A4 is the num¬ 
ber of snapshots acquired by sampling sufficiently long 
MD trajectories. This configurational distribution is then 
used to determine fluctuations around the mean struc¬ 
ture, 6R = R — {R), where (•) indicates averaging over 
time and multiple MD trajectories generated using dif¬ 
ferent random seeds. 

Within the framework of elastic network models, the 
configurational distribution is most conveniently ex¬ 
pressed in terms of “modal fluctuations” 


Sr = T-^/^SR (1) 

where T = (SRSR^) is the covariance matrix associated 
with real-space fluctuations. A general analytical expres¬ 
sion for p{dr) in terms of Hermite tensor polynomials was 
originally proposed by Flory [T^ : 

oo 
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where Hi, denotes the Hermite tensor polynomial of rank 
V, and the coefficients Cy follow from the orthogonality 
relation f Hy{Sr)Hf^{dr)dr = {v\Y^ 

In {r} basis, all expansion coefficients Cj^/o in Eq.([2]) 
vanish for a perfectly elastic network. As a result, the 
configurational distribution of such a linear system is 
separable into 3N identical Gaussian functions with unit 
standard deviation and zero mean: 


3N 

p(9)(^Sr) = 

m—1 


exp[-(i5r^)^/2] 


( 3 ) 


The superscript implies the Gaussian product form of 
p. All vibrational modes of the linear system are repre¬ 
sented in an identical fashion in this normalized form of 
the distribution. Yet, their physical difference is evident 
from and encoded in the corresponding eigenvalues and 
eigenvectors of T. 




Nonlinearity can be introduced into this picture with¬ 
out coupling the vibrational modes, by simply adding 
higher-order terms to the corresponding harmonic oscil¬ 
lator Hamiltonians: 


then be expressed as 

p(®)(,5r) = W_Pm{5rm) ■ (5) 
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where 


Pm{Srm) 


exp[-(fr^)^/2] 


l + Y,CHA6rm) 

n—3 
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where pm and km are the effective mass and the spring 
constant for the mode and ctri.m indicate the strength 
of higher-order terms in appropriate units, all of which 
can in principle be derived from the underlying dynamics. 
The resulting conhgurational distribution function can 


Note that, H^, above is now the ordinary Hermite poly¬ 
nomial of rank ly. Eqs. (15161) describe the most general 
separable distribution for the variables {5rm}, hence the 
superscript 

An arbitrary configurational distribution can be ex¬ 
pressed in a similar fashion, starting from Eq.(I2]) fTl| : 
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Eq.(0 allows one to address all nonlinear corrections to 
the Gaussian description of near-native fluctuations in 
a methodical, order-by-order fashion. For the type of 
nonlinearity which yields Eq. ([S]) in a suitably chosen ba¬ 
sis {r} we propose the term “marginal anharmonicity”, 
because it is fully characterized by marginal distribu¬ 
tions Pmidrm)- In other words, all coefficients 
and their higher-order counterparts in Eq.(I7|) can be de¬ 
rived in this case from the coefficients of “one-body” 
terms c™ = {H,j{5rm))/v^-- For example, = 

On the other hand, the diagonalization 
problem, that is, determining the transformation i? —> r 
for an arbitrary marginally anharmonic system is signif¬ 
icantly harder than that for a linear one. (This problem 
arises naturally in signal processing field and is known as 
“Independent Component Analysis” or ICA Ho 

Deviations in p{r) from the separable form in Eq.([5|) 
are, by construction, due to coupling between (possibly 
anharmonic) vibrational modes. In fact, when such cou¬ 
pling is strong, the solutions of the nonlinear system will 
depart significantly from those found in the noninter¬ 
acting limit in Eq.([5]); so that, describing the dynamics 
around the vibrational “modes” in Eq.(|4|) may no more 
be justihed. However, numerous earlier studies suggest 
that a quasi-elastic treatment of protein dynamics is an 
adequate and fruitful course at physiological tempera¬ 
tures [Tsl - fisll . Therefore, in the following we assume 
that the eigenvectors of T in Eq.([T|) continue to serve 
as a meaningful basis in which Pm(?'m) calculated along 
the eigen-directions yield a good approximation for the 
best marginally anharmonic description of the sys¬ 
tem. Our results reported below corroborate this expec¬ 
tation. 

While the degree of coupling between mode pairs (also 
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triplets, quartets and so on) can be discussed by investi¬ 
gating corresponding high-order terms in Eq.(I71), it was 
shown earlier that a cumulative treatment of all cou¬ 
pling corrections to Eq.([S|) is not only possible but also 
easier [l^ . This is done by quantifying the difference 
between the near-native conformational distribution ob¬ 
tained from MD (which, in principle, includes effects at 
all orders) and the best description of the data in terms 
independent modes, as given by Eq.([5]). 

A motivation for studying mode coupling is to gain 
insight about the mechanism and regulation of function 
in allosteric proteins. Accordingly, one is typically in¬ 
terested in locations on the structure where mutations or 
immobilization by means of ligand binding alter protein’s 
activity. After isolating multi-mode contributions to fluc¬ 
tuations as described above, the next step is therefore to 
project this information onto the protein structure in or¬ 
der to identify regions which are critical for mediating 
the coupling between relevant modes. To this end, we 
first map the distribution back onto Cartesian space 
by 


p^^\6R) = p^^\5r)/Vdet T . (8) 

Next, the difference between p{dRi) and p^^'>{SRi) is 
measured along each Cartesian component of the position 
vector Ri of residue i by means of the Jensen-Shannon 
(JS) divergence djs[p,p^^^] defined as 

djs b, ^ [dki {p, M) + dki ip'' ,M)] (9) 

where M = \{p-\-p’^) and dki{p, q) is the Kullback-Leibler 
divergence that is given by 


dki{p,q) = / p{x) In 
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( 10 ) 

















An advantage of using JS divergence is that it is symmet¬ 
ric, therefore immune to the possibility that one of the 
two distributions may vanish at certain points (Kullback- 
Leibler divergence yields infinity for such instances). 

Following the recipe above, we calculate the mode- 
coupling score for each amino acid in a protein, as the 
sum of JS divergences calculated along the components 
of Ri. Note that, comparing p and aminoacid by 
aminoacid rather than in their full domain not only yields 
a testable output (locations relevant to mode coupling in 
the protein), but is also more robust to stochastic fluctu¬ 
ations inherent to the method, simply because of reduced 
dimensionality. That is, histograms for spatial fluctua¬ 
tions of single aminoacids can be represented with much 
fewer bins compared to joint distributions p or for the 
whole protein, therefore they are sufficiently well sampled 
along the MD trajectories. Variations in the output as 
a function of the simulation length as well as between 
independent MD runs are investigated in Section FTO 

An earlier version of this prescription was applied to 
motor protein myosin II with encouraging results [l^ . 
We here consider the ancestral chain of GR proteins 
and show that the outlined analysis reveals a significant 
bias in mode-coupling scores for experimentally validated 
function-altering mutations in the family. 


C. Glucocortocoid receptors and evolutionary data 

Glucocorticoid receptors are a class of endogenous 
steroid hormones that regulate imflammatory and stress 
responses, growth, development, and apoptosis [20l - l^ . 
GR positively regulates transcription through a process 
known as transactivation in which the ligand-bound GR 
dislocates from cytoplasm to enter cell nucleus where it 
activates transcription Its paralogous counterpart, 
mineralocorticoid receptor (MR) is mainly responsible for 
regulating electrolyte homeosta sis 1^ . While GR binds 
glucocorticoid hormone cortisol [^, MR acts as a host 
for aldosterone, 11-deoxycorticosterone (DOC) and with 
a lesser affinity for cortisol [s^. Through phylogenetic 
analysis, the sequence and the crystal structure of their 
common ancestor AncCR, as well as the ancestral GR 
proteins in cartilaginous fish (AncGRl) and in bony ver¬ 
tebrates (AncGR2) have been determined [Sll,!!^. It has 
been shown that AncCR, similar to AncGRl, indiscrim¬ 
inately binds to DOC, aldosterone and cortisol. On the 
other hand, AncGR2 exclusively binds cortisol and is not 
activated by aldosterone and DOC. 

Considerable effort has been devoted to understand¬ 
ing the basis of ligand specificity in the evolution of 
GR [jll - l^ . Structural variations are minute, with < lA 
RMSD difference between AncGRl and AncGR2 
Among the 36 mutations that transform AncGRl to An- 
cGR2, it has been shown that two strictly conserved mu¬ 
tations, S106P and LlllQ (group X), are sufficient to 
increase cortisol specifity. S106P changes the architec¬ 
ture of ligand binding pocket and allows Till to be lo¬ 


cated at a closer position to the ligand. The effect of 
LlllQ is biochemical rather than mechanical, since it 
creates an additional hydrogen bond between 11 IQ and 
the cortisol, which lacks in DOC and aldosterone bind¬ 
ing. Three additional mutations, L29M, F98I and S212J 
(group Y), wipe out the affinity towards DOC and aldos¬ 
terone. However, AncGRl-l-X-l-Y structure cannot ac¬ 
tivate transcription due to the damaged hydrogen bond 
network which destabilizes the activation-function helix. 
Two further mutations, N26T and Q105L (group Z), are 
necessary in order to reestablish the hydrogen bond net¬ 
work and thereby stabilize the structure. All together, 
AncGRU-X-l-Y-l-Z, captures AncGR2 phenotype. A re¬ 
cent study found that XYZ mutations correlate with a 
measure based on the difference in fluctuation amplitudes 
of a residue in principal vibrational modes of ancestral 
GR proteins [sj. 

Besides historically occuring mutations, another study 
on alternative evolutionary pathways that restore An- 
cGR2 phenotype [s^ demonstrated that, among a 
set of suggested alternatives, only the mutation pair 
Q114L/M197I recovers cortisol sensitivity similar to the 
historical set of permissive mutations, albeit with a loss 
of associated transcriptional function. 


D. Specifics on molecular dynamics simulations 

Grystal structures of AncCR, AncGRl, and AncGR2 
are publicly available at PDB with accession codes 2Q3Y, 
3RY9, and 3GN8, respectively. MD simulations of each 
were carried out on Tesla K20 GPUs W means of Am¬ 
ber 14 Molecular Dynamics Package using ffl4sb 
force-field [1^. Ligand molecules were also included in 
the simulations and their parametrization were done with 
Antechamber using generalized Amber force field [13, ■ 

All simulations were performed in (iV, P, T) ensemble 
with explicit water solvent and with Langevin dynam¬ 
ics which maintained the temperature at 310 K and the 
pressure at 1 bar. No rigid bonds were assumed. 1 fs 
timesteps were used between successive frames while tra¬ 
jectories were captured every 1000 frames, i.e. 1 ps apart, 
throughout 128 ns long simulations performed for each 
sample. Each protein was simulated four times with the 
same initial condition but different random number gen¬ 
erator seeds. Before further analysis, trajectories were 
aligned by means of the backbone Ca atoms. Discard¬ 
ing the first 7 ns of each simulation for equilibration, 
this protocol resulted in 5 x 10^ snapshots, derived from 
~ 500 ns long simulations of the near-native dynamics 
for each protein. 


E. Mode coupling in GR proteins and comparison 
with evolutionary data 

A substantial contribution from marginal anharmonic- 
ity is observable in the amplitude distributions of the 



FIG. 1. Marginal distributions for the amplitudes of six slow¬ 
est modes of AncGR2. Anharmonicty is most discernible in 
the first three and gradually disappears for faster modes. An¬ 
alytical approximations derived from Eq. m with a cut-off at 
V = 32 are also shown in red. 


slowest modes of AncGR2 protein shown in Fig. [T] 
Strong deviation from Gaussian in eigenmodes with large 
eigenvalues is typical for the whole GR family, in fact for 
most proteins [3^ 1^ . A comparison of the marginal dis¬ 
tributions and Eq.(IB]) with a cut-off at ?; = 32 

confirms that they are represented accurately by the an¬ 
alytical approach in Section [H 

On the other hand, presence of mode coupling is ev¬ 
ident from the difference between p and p^^\ as shown 
in Fig. [2] by considering joint amplitude distributions for 
(Srm^Srn) corresponding to slowest mode pairs (m,n) = 
(1, 2), (1, 3), (2, 3). Gontribution of mode coupling is ex- 
emplihed by the difference between the two rows of Fig. [21 
Information content of such deviation from marginal an- 
harmonicity in protein dynamics and its relevance to pro¬ 
tein’s biological function is our focus in this study. 

We start by asking whether certain mode pairs stand 
out in the above analysis. One can assess the overall 
impact of a mode pair by considering the (dimensionless) 
conformational free energy 

r 1 

F[p] = — p{dr)hip{dr)dSr = — ^^lnp(5ri) (11) 

calculated with and without the mode coupling contri¬ 
bution from the mode pair (m, n) in Eq. ©• Eor this 
purpose, we approximated p{5ri) by the one- and two- 
body terms spelled out in Eq.(I7|). We then defined 
P-rnni^i'i) = p(5r)I ^ „ , in wluch the mode- 

coupling contribution of the pair (m, n) is discarded. The 
difference Amn = F[p-mn] — F[p\ is a measure of the 
impact of the interaction between modes m and n on 
protein’s behavior near equilibrium. A^n for all pairs 
composed out of slowest 25 modes of each protein are 
given in Fig. O It is interesting that Amn displays a 
power-law dependence with a scaling exponent ^ —0.8 
over more than two decades on the rank order of the pair 
(m,n) (Fig. Od). An exhaustive analysis over all mode 
pairs was not performed due to its heavy computational 
cost. 


We found that the highest-impact mode pairs are 1-3 
for AncGR and AncGRl; and 1-7 and 2-6 for AncGR2. 
Spatial fluctuations associated with these mode pairs co¬ 
incide with helices 7 and 10, along with the loop region 
proceeding helix-7. Indeed, these helices form part of the 
ligand binding pocket, while the loop before helix-7 is 
where the two X mutations are located. We additionally 
observed a region on helix-9 with high sequence conser¬ 
vation score also involved in mode coupling which, to our 
knowledge, has not been highlighted in earlier studies. 

We next performed the analysis outlined in Sec¬ 
tion m in order to derive a mode-coupling score for each 
aminoacid. The resulting score vectors obtained over the 
full data set (four trajectories) separately for each mem¬ 
ber of the GR family are shown as a heat map super¬ 
imposed onto the proteins’ three-dimensional structure 
in Fig. 01 We observed that the loop (100-110) proceed¬ 
ing helix-7 yields considerably high scores in all proteins, 
despite the fact that this loop and the nearby helix-7 ex¬ 
hibit the largest structural variability between AncGRl 
and AncGR2 [s^. Furthermore, the same region also 
accommodates 4 of the 6 (XYZ-)mutations mentioned 
above. These observations hint at the relevance of mode 
coupling to the evolutionary history of function in the GR 
protein’s lineage, which we investigate below in further 
detail. 

Note that, the location of the X-mutation S106P con¬ 
sistently has one of the highest scores in all proteins. 
Gonsidering that S106P alone decreases activation in An¬ 
cGR indepedent of the ligand type suggests that 
the mechanism underlying the activity loss is mechanical 
in origin, rather than biochemical (to which the present 
method is insensitive). The opposite is true for the sec¬ 
ond X mutation LlllQ which recovers cortisol speci- 



FIG. 2. Pairwise joint probability distributions given as heat 
maps for the amplitudes of three slowest modes in AncGR2 
(high probability regions are shown in yellow). First row 
corresponds to the MD data for pairs 1-2, 1-3, and 5-6, re¬ 
spectively. Second row gives the product of corresponding 
marginal distributions. It is evident that, joint distributions 
of slow modes can not be captured by Eq.©, meaning mode¬ 
coupling corrections must be included. 













FIG. 3. (a) Pairwise mode coupling scores for slowest 25 modes of the three members of GR protein lineage, (b) Mode-coupling 
scores Amn of rank-ordered mode pairs for the three proteins. The straight line segment (black) corresponds to a power-law 
decay with an exponent —0.8. 


ficity ( 3 ^ by allowing formation of a hydrogen bond with 
cortisol. Mode-coupling score of location 111 shows no 
significant deviation from the mean. On the other hand, 
the synthetic mutations Q114L/M197I in AncGRl - that 
also recover cortisol specificity and disrupt communica¬ 
tion between cortisol binding and transcriptional activity 
- coincide with the two mode-coupling peaks in AncGRl 
located on helix-7 and helix-10. 

Complementing these observations, an objective eval¬ 
uation of the correlation between mode-coupling scores 
and the AncGRl —>■ AncGR2 mutation set is desirable. 
For this purpose, we use the recall analysis where muta¬ 
tion sites under consideration are labelled as the target 
set and their rankings are inspected in the full residue 


AncCR 


AncGRl 


AncGR2 








FIG. 4. (a) Cartoon representation of the three studies pro¬ 
teins. Helices 3 (29-54),7 (108-125), and 10 (180-210) are 
shown in red, yellow, and purple, respectively. Loop region 
preceeding helix 7 is colored in blue with activation-function 
helix (AF-h) (220-232) in magenta. Helices 3,7, and 10 along¬ 
side with helix 1 (not shown) form a part of ligand bind¬ 
ing pocket. AF-h is essential for transcriptional activity, (b) 
Mode-coupling scores mapped onto the corresponding protein 
structures where hotter colors represent higher scores. Signif¬ 
icant activity is observed on helices 7 and 10, and around loop 
regions. 




FIG. 5. Recall analysis for AncCR, AncGRl, AncGR2. For 
each protein, mode-coupling scores and B-values (rms am¬ 
plitude of Ca fluctuations) were used for ranking residues. 
Target residues were set to be XYZ-mutation locations or the 
locations of all mutations that occured between two evolution¬ 
ary steps. All figures show a significant positive bias towards 
XYZ-mutations when mode-coupling scores are used. This 
pattern is lost when B-values are used for scoring. 


list sorted according to mode-coupling scores. The result 
is presented as a recall curve which is a plot of the frac¬ 
tion of the target set elements (y-coordinate) observed 
in a given fraction (x-coordinate) of the list picked from 
the top. In absence of correlation between the target set 
and the scoring function, one expects to see the recall 
to remain on the diagonal upto statistical fluctuations. 
A recall curve remaining significantly above the diagonal 
indicates a positive correlation, since it reflects the fact 
that the aminoacids in the target set come with higher- 
than-average scores. 

Fig. [S] shows the outcome of the recall analysis for the 
three proteins in the GR family. In all cases, we observe 
no visible correlation between mode-coupling scores and 
the complete set of mutations accompanying each evolu¬ 
tionary step. Focusing on the function changing XYZ- 
mutations only, we first note an overall positive correla¬ 
tion with B-values (variance of fluctuations around equi¬ 
librium for each Ca atom), due to the fact that these mu¬ 
tations are located mostly on the loop regions. It is strik¬ 
ing that the mode-coupling based ordering yields better 
recall values in all three proteins. We furthermore ob¬ 
serve that the recall performances improve slightly when 
mode-coupling scores are divided by the B-values in or- 
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FIG. 6. Self consistency of the score vectors increases with 
the length of the MD trajectory, (cos 0)t is the average value 
of the dot product of two normalized score vectors obtained 
from different time windows of size T. 


der to factor out the bias mentioned above. This is the 
central result of the present work which, together with 
a similar observation on myosin II [l^ , lends support to 
the thesis that coupling between vibrational modes is a 
key physical mechanism in protein function. 


F. Robustness wrt data acquisition period 


Since most proteins carry out their function in time 
scales beyond the reach of computer simulations, it is 
natural to ask how sensitive above results are to the sim¬ 
ulation time window. We investigated the robustness of 
our findings by re-analyzing the data in varying time in¬ 
tervals. To this end, we divided each trajectory into Nt 
fragments of T ns each (T = 1, 2,4, 8 ,..., 128) and calcu¬ 
lated mode-coupling scores by using each fragment sepa¬ 
rately. We then compared score vectors for each interval 
pair with identical lengths by measuring the angle be¬ 
tween them. This was done by evaluating the dot prod¬ 
uct of the two vectors after setting their mean to zero (by 
a constant shift) and rescaling them to unit length. The 
mean (cos0)t and the standard devitation ctt of the ob¬ 
tained dot products were recorded separately for each in¬ 
terval length T. Results shown in Fig. [6] confirm that the 
analysis detailed in section [B] yields progressively more 


consistent results with increasing T. 


G. Conclusion 

While it is natural from a physical point of view to 
postulate that nonlinear effects mitigate energy transfer 
within a proteinQ, precisely how the nonlinearity ob¬ 
served in protein dynamics can be fruitfully exploited 
to yield biologically relevant predictions is unclear. Even 
the relevance of nonlinear effects to protein function is far 
from being universally acknowledged. While part of the 
literature (such as on discrete breathers [4ll-l44|l attests 
to its importance, there is substantial amount of past 
and recent work which investigate mechanisms of protein 
function within a linear (harmonic) framework or at the 
level of principal component analysis H, [13, [H, HE 113 ■ 
By demonstrating that functionally critical mutations 
along the evolutionary descent which relates three an¬ 
cestral proteins of the GR family are highlighted in an 
analysis of the nonlinear contribution to dynamics, the 
present work emphasizes the significance of nonlinearity, 
in particular that beyond marginal anharmonicity, to pro¬ 
tein function. 

The selective power of the mode-coupling analysis for 
functionally relevant sites (in GR protein family reported 
here and in myosin II earlier [l3|) is suggestive. However, 
it is also evident from the data that not all known func¬ 
tional locations come with high mode-coupling scores. 
Given the complexity of the system and the multitude of 
factors beyond protein dynamics that play role in func¬ 
tionality, this is only expected. Applying the analysis 
on carefully constructed toy nonlinear models may help 
clarify the mechanistic role played by the aminoacids 
that score high in the present analysis. Such information 
could be useful for characterizing the proteins on which 
the current approach may be expected to be successful 
in future. 
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